Transmit power control techniques for wireless communication systems

ABSTRACT

Methods and systems of controlling transmit power in a multi-user wireless communication system are provided. A transmit power level for a transmitter is determined according to an algorithm based in communication theory, such as an iterative water-filling procedure, which takes into account an assumption of transmit behaviours of transmitters in the communication system. Actual transmit behaviours of transmitters in the communication system are monitored using a learning algorithm to determine whether any transmitters exhibit transmit behaviours which are not consistent with the assumption.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 11/577,280 filed on Apr. 13, 2007, which claims the benefit of and is a National Phase Entry of PCT Application Serial No. PCT/CA2005/001565 filed on Oct. 13, 2005, which in turn claims the benefit of U.S. Provisional Patent Application Ser. Nos. 60/617,638 and 60/617,639, both filed on Oct. 13, 2004.

The entire contents of these related patent applications are incorporated in their entirety herein by reference.

FIELD OF THE INVENTION

This invention relates generally to wireless communications and, in particular, to controlling transmitter power in wireless communication systems.

BACKGROUND

In the field of wireless communications, cognitive radio is viewed as a novel approach for improving the utilization of a precious natural resource, the radio electromagnetic spectrum.

The cognitive radio, built on a software-defined radio, is defined as an intelligent wireless communication system that is aware of its environment and uses the methodology of understanding-by-building to learn from the environment and adapt to statistical variations in the input stimuli, with two primary objectives in mind, namely highly reliable communication whenever and wherever needed, and efficient utilization of the radio spectrum.

Attaining these objectives in cognitive radio may therefore involve controlling communication equipment in such a manner as to provide reliable communications while reducing adverse effects on other communication equipment. Conventional communication techniques, however, typically exploit available communication resources for the benefit of particular communication equipment, one particular user for instance, without substantial consideration of impact on other communication equipment.

SUMMARY OF THE INVENTION

According to one broad aspect of the invention, there is provided a method of controlling transmit power in a multi-user wireless communication system. The method involves determining a transmit power level for a transmitter in the wireless communication system according to a communication theory algorithm. The communication theory algorithm is based on an assumption of behaviours of transmitters in the communication system. The method also involves monitoring behaviours of the transmitters in the communication system using a learning algorithm.

In some embodiments, the communication theory algorithm comprises an iterative water-filling procedure which accounts for determined transmit power levels of the transmitter and other transmitters in the communication system.

In some embodiments, the iterative water-filling procedure involves initializing a transmit power distribution across n transmitters, performing water-filling for the transmitter to determine a transmit power level for a target data transmission rate of the transmitter subject to a power constraint for the transmitter and a level of interference, the level of interference comprising a noise floor plus either initialized transmit power levels or previously determined transmit power levels for the other transmitters, determining whether a data transmission rate of the transmitter is greater than or less than a target data transmission rate of the transmitter, and if so, adjusting the determined transmit power level for the transmitter, determining whether a target data transmission rate of at least one of the n transmitters is not satisfied by a respective adjusted transmit power level for the at least one transmitter, and repeating the operation of performing water-filling where the target data transmission rate of at least one of the n transmitters is not satisfied.

In some embodiments, the operations of performing, determining whether a data transmission rate of a transmitter is greater than or less than a target transmission rate, and adjusting are repeated for each of the other transmitters.

In some embodiments, determining whether the target data transmission rate of at least one of the n transmitters is not satisfied comprises determining whether the target data transmission rates of all of the n transmitters are not satisfied, and repeating the operation of performing water-filling comprises repeating the operation of performing water-filling where the target data transmission rates of all of the n transmitters are not satisfied.

In some embodiments, adjusting comprises reducing the determined transmit power level for the transmitter where the data transmission rate of the transmitter is greater than the target data transmission rate of the transmitter.

In some embodiments, where the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter, adjusting comprises determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit, and increasing the determined transmit power level of the transmitter where increasing the determined transmit power level of the transmitter would not violate an interference level limit.

In some embodiments, the interference level limit comprises an interference temperature limit.

In some embodiments, determining whether the target data transmission rate of at least one of the n transmitters is not satisfied comprises determining whether the data transmission rate differs from the target data transmission rate by a predetermined amount.
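The iterative water-filling embodiments summarized above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, none of which come from the description itself: direct channel gains are normalized to one, interference between transmitters is modelled by a hypothetical scalar `cross_gain[i][j]`, the data transmission rate is taken as a sum of log2(1 + power / (noise + interference)) over sub-bands, a single per-transmitter cap `p_max` stands in for both the power constraint and the interference level limit, and the adjustment step is done by bisection. All function and parameter names are illustrative.

```python
import math

def water_fill(levels, budget):
    """Spread `budget` over sub-bands: p_k = max(0, mu - levels[k])."""
    lo, hi = min(levels), max(levels) + budget
    for _ in range(60):  # bisection on the water level mu
        mu = 0.5 * (lo + hi)
        if sum(max(0.0, mu - lv) for lv in levels) > budget:
            hi = mu
        else:
            lo = mu
    return [max(0.0, lo - lv) for lv in levels]

def rate(powers, levels):
    """Assumed rate model: sum of log2(1 + power / (noise + interference))."""
    return sum(math.log2(1.0 + p / lv) for p, lv in zip(powers, levels))

def interference(i, powers, noise, cross_gain):
    """Noise floor plus the other transmitters' current powers, per sub-band."""
    return [noise[k] + sum(cross_gain[i][j] * powers[j][k]
                           for j in range(len(powers)) if j != i)
            for k in range(len(noise))]

def iterative_water_filling(noise, cross_gain, targets, p_max,
                            tol=0.01, max_rounds=200):
    n = len(targets)
    powers = [[0.0] * len(noise) for _ in range(n)]  # initial distribution
    rates = [0.0] * n
    for _round in range(max_rounds):
        for i in range(n):
            levels = interference(i, powers, noise, cross_gain)
            lo, hi = 0.0, p_max[i]
            # Rate above target at full power: reduce the budget by bisection.
            # Otherwise keep the cap, since raising power further is not allowed.
            if rate(water_fill(levels, hi), levels) > targets[i]:
                for _step in range(50):
                    mid = 0.5 * (lo + hi)
                    if rate(water_fill(levels, mid), levels) >= targets[i]:
                        hi = mid
                    else:
                        lo = mid
            powers[i] = water_fill(levels, hi)
        # Re-evaluate every rate against the updated interference levels.
        rates = [rate(powers[i], interference(i, powers, noise, cross_gain))
                 for i in range(n)]
        if all(abs(r - t) <= tol for r, t in zip(rates, targets)):
            break  # every target is met to within the predetermined amount
    return powers, rates

# Two transmitters sharing two sub-bands (all values are made up).
powers, rates = iterative_water_filling(
    noise=[0.1, 0.2], cross_gain=[[0.0, 0.1], [0.1, 0.0]],
    targets=[2.0, 2.0], p_max=[10.0, 10.0])
print(powers, rates)
```

In this sketch a transmitter that cannot reach its target even at `p_max` simply stays at the cap, so the outer loop then runs until `max_rounds`. An actual embodiment might instead adapt the modulation strategy or move to a further spectrum hole, as described in the embodiments below.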

In some embodiments, monitoring comprises determining whether the behaviours of the transmitters are consistent with the assumption of behaviours, and the method further comprises generating an alert where the behaviour of one or more of the transmitters is not consistent with the assumption.

In some embodiments, the method further comprises taking corrective action affecting the one or more of the transmitters responsive to the alert.

In some embodiments, the method further comprises detecting at least one spectrum hole, and determining comprises determining a transmit power level for the transmitter for transmission within the at least one spectrum hole.

In some embodiments, the method further comprises predicting subsequent availability of the at least one spectrum hole, and repeating the operations of determining and monitoring when the at least one spectrum hole is predicted to become unavailable.

In some embodiments, the method further comprises detecting a further spectrum hole, detecting an increase in interference in the at least one spectrum hole, and repeating the operations of determining and monitoring to control transmit power levels for the transmitter for transmission within the further spectrum hole responsive to detecting an increase in interference in the at least one spectrum hole.

In some embodiments, the method further comprises determining a position of the transmitter relative to other transmitters in the communication system.

In some embodiments, the method further comprises determining a multi-user path loss matrix of the operating environment of the transmitter based on the determined position of the transmitter relative to the other transmitters.

In some embodiments, the learning algorithm comprises a regret-conscious learning algorithm.

In some embodiments, the learning algorithm comprises a Lagrangian learning algorithm.

In some embodiments, the method further comprises adapting a modulation strategy for transmission of data by the transmitter where the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter and increasing the determined transmit power level of the transmitter would violate an interference level limit.

In some embodiments, determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit comprises determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit within a spectrum hole, and the method further comprises, where the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter and increasing the determined transmit power level of the transmitter would violate an interference level limit within the spectrum hole, detecting a further spectrum hole, and determining a further transmit power level for the transmitter for transmission within the further spectrum hole.

In some embodiments, a machine-readable medium stores instructions which when executed perform the method.

Another broad aspect of the invention provides a system for controlling transmit power in a multi-user wireless communication system. The system includes an input for receiving information associated with transmit behaviours of transmitters in the wireless communication system, and a processor operatively coupled to the input. The processor is configured to determine a transmit power level for a transmitter in the wireless communication system according to a communication theory algorithm, the communication theory algorithm being based on an assumption of transmit behaviours of transmitters in the communication system, and to monitor the transmit behaviours of the transmitters in the communication system using a learning algorithm.

In some embodiments, the processor is further configured to implement a cognitive radio.

In some embodiments, the communication theory algorithm comprises an iterative water-filling procedure, the iterative water-filling procedure accounting for determined transmit power levels of the transmitter and other transmitters in the communication system.

In some embodiments, the processor is configured to determine a transmit power level for the transmitter by initializing a transmit power distribution across n transmitters, performing water-filling for the transmitter to determine a transmit power level for a target data transmission rate of the transmitter subject to a power constraint for the transmitter and a level of interference, the level of interference comprising a noise floor plus either initialized transmit power levels or previously determined transmit power levels for the other transmitters, determining whether a data transmission rate of the transmitter is greater than or less than a target data transmission rate of the transmitter, and if so, adjusting the determined transmit power level for the transmitter, determining whether a target data transmission rate of at least one of the n transmitters is not satisfied by a respective adjusted transmit power level for the at least one transmitter, and repeating the operation of performing water-filling where the target data transmission rate of at least one of the n transmitters is not satisfied.

In some embodiments, the processor is further configured to determine whether the target data transmission rate of at least one of the n transmitters is not satisfied by determining whether the target data transmission rates of all of the n transmitters are not satisfied, and to repeat the operation of performing water-filling where the target data transmission rates of all of the n transmitters are not satisfied.

In some embodiments, the processor is configured to adjust the determined transmit power level by reducing the determined transmit power level for the transmitter where the data transmission rate of the transmitter is greater than the target data transmission rate of the transmitter.

In some embodiments, the processor is further configured to adjust the determined transmit power level by determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit, where the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter, and increasing the determined transmit power level of the transmitter where increasing the determined transmit power level of the transmitter would not violate an interference level limit.

In some embodiments, the processor is configured to determine whether the target data transmission rate of at least one of the n transmitters is not satisfied by determining whether the data transmission rate differs from the target data transmission rate by a predetermined amount.

In some embodiments, the processor is configured to monitor the transmit behaviours of the transmitters by determining whether the behaviours of the transmitters are consistent with the assumption of behaviours, and the processor is further configured to generate an alert where the behaviour of one or more of the transmitters is not consistent with the assumption.

In some embodiments, the processor is further configured to detect at least one spectrum hole, and to determine a transmit power level by determining a transmit power level for the transmitter for transmission within the at least one spectrum hole.

In some embodiments, the at least one spectrum hole comprises a plurality of spectrum holes, and the processor is configured to determine a transmit power level by determining a set of transmit power levels comprising multiple transmit power levels for transmission within respective ones of the plurality of spectrum holes.

In some embodiments, the processor is further configured to predict subsequent availability of the at least one spectrum hole, and to repeat the operations of determining and adapting when the at least one spectrum hole is predicted to become unavailable.

In some embodiments, the processor is further configured to detect a further spectrum hole, to detect an increase in interference in the at least one spectrum hole, and to repeat the operations of determining and adapting to control transmit power levels for the transmitter for transmission within the further spectrum hole responsive to detecting an increase in interference in the at least one spectrum hole.

In some embodiments, the processor is further configured to determine a position of the transmitter relative to the other users.

In some embodiments, the system also includes a Global Positioning System (GPS) receiver, and the processor is operatively coupled to the GPS receiver and configured to determine a position of the transmitter based on signals received by the GPS receiver.

In some embodiments, the processor is further configured to determine a multi-user path loss matrix of the operating environment of the transmitter based on the determined position of the transmitter relative to the other transmitters.

In some embodiments, the processor is further configured to adapt a modulation strategy for transmission of data by the transmitter where the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter and increasing the determined transmit power level of the transmitter would violate an interference level limit.

In some embodiments, the processor is configured to determine whether increasing the determined transmit power level of the transmitter would violate an interference level limit by determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit within a spectrum hole, and the processor is further configured to detect a further spectrum hole and to determine a further transmit power level for the transmitter for transmission within the further spectrum hole, where the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter and increasing the determined transmit power level of the transmitter would violate an interference level limit within the spectrum hole.

In some embodiments, the communication system comprises a Multiple-Input Multiple-Output (MIMO) communication system.

In some embodiments, the system is implemented in at least one of a mobile communication device and a base station in the communication system.

In some embodiments, the communication system is configured for communication of Orthogonal Frequency Division Multiplexing (OFDM) signals.

In some embodiments, the system is implemented in a plurality of communication devices, each of the plurality of communication devices comprising a transmitter and a processor configured to determine a transmit power level for the transmitter according to the communication theory algorithm, and to monitor transmit behaviours of other communication devices of the plurality of communication devices using the learning algorithm.

Other aspects and features of embodiments of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific illustrative embodiments of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Examples of embodiments of the invention will now be described in greater detail with reference to the accompanying drawings, in which:

FIG. 1 is a block diagram representation of a cognitive cycle;

FIG. 2 is a block diagram illustrating differences between Markov decision processes, matrix games, and stochastic games;

FIG. 3 is a signal flow graph of a two-user communication scenario;

FIG. 4 is a plot of results of an illustrative experiment on a two-user wireless communication scenario;

FIG. 5 is a flow diagram of a transmit power control method according to an embodiment of the invention;

FIG. 6 is a block diagram of communication equipment in which embodiments of the invention may be implemented; and

FIG. 7 is a time-frequency plot illustrating dynamic spectrum sharing for OFDM.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The electromagnetic radio spectrum is a natural resource, the use of which by transmitters and receivers is licensed by governments. In November 2002, the Federal Communications Commission (FCC) published a Report (ET Docket No. 02-135) prepared by the Spectrum-Policy Task Force aimed at improving the way in which this precious resource is managed in the United States of America. Among the Task Force Major Findings and Recommendations, the second Finding on page 3 of the Report is rather revealing in the context of spectrum utilization:

-   In many bands, spectrum access is a more significant problem than physical scarcity of spectrum, in large part due to legacy command-and-control regulation that limits the ability of potential spectrum users to obtain such access.

Indeed, a scan of portions of the radio spectrum would likely find that some frequency bands in the spectrum are largely unoccupied most of the time, other frequency bands are only partially occupied, and the remaining frequency bands are heavily used.

The under-utilization of the electromagnetic spectrum leads us to think in terms of “spectrum holes”. A spectrum hole may be generally considered as a band of frequencies assigned to a primary user, which at a particular time and specific geographic location is not being utilized by its primary user.

Spectrum utilization can be improved significantly by making it possible for a secondary user who is not being serviced to access a spectrum hole which is not being utilized by the primary user at a current time and location of the secondary user. Cognitive radio, inclusive of software-defined radio, may offer a means to promote the efficient use of the spectrum by exploiting the existence of spectrum holes.

At this point, it may be useful to consider what is meant by “cognitive radio”. The Encyclopedia of Computer Science (A. Ralston and E. D. Reilly, Encyclopedia of Computer Science, pp. 186-186, Van Nostrand Reinhold, 1993), provides a three-point computational view of cognition:

-   (i) mental states and processes intervene between input stimuli and output responses;
-   (ii) the mental states and processes are described by algorithms; and
-   (iii) the mental states and processes lend themselves to scientific investigations.

Moreover, it may be inferred that the interdisciplinary study of cognition is concerned with exploring general principles of intelligence through a synthetic methodology which is generally termed learning by understanding. Putting these ideas together and bearing in mind that cognitive radio is aimed at improved utilization of the radio spectrum, the following definition for cognitive radio may be appropriate:

-   Cognitive radio is an intelligent wireless communication system that is aware of its surrounding environment (i.e., outside world), and uses the methodology of understanding-by-building to learn from the environment and adapt its internal states to statistical variations in incoming RF stimuli by making corresponding changes in certain operating parameters (e.g., transmit-power, carrier-frequency, and modulation strategy) in real-time, with two primary objectives in mind:
    -   highly reliable communications whenever and wherever needed, and
    -   efficient utilization of the radio spectrum.

Six key words stand out in the above definition: awareness, intelligence, learning, adaptivity, reliability, and efficiency. The awareness capability of cognitive radio, for example, may embody awareness with respect to transmitted waveforms, radio frequency (RF) spectrum, communication network, geography, locally available services, user needs, language, situation, and security policy.

Implementation of this far-reaching combination of capabilities is indeed feasible today, thanks to the advances in digital signal processing, networking, machine learning, computer software, and computer hardware.

In addition to these cognitive capabilities, a cognitive radio is also endowed with reconfigurability. This latter capability is provided by a platform known as software-defined radio, upon which a cognitive radio is built. Software-defined radio (SDR) is a practical reality today, thanks to the convergence of digital radio and computer software technologies. Reconfigurability may provide the basis for such features as adaptation of a radio interface so as to accommodate variations in the development of new interface standards, incorporation of new applications and services as they emerge, incorporation of updates in software technology, and exploitation of flexible services provided by radio networks, for example.

For reconfigurability, a cognitive radio looks naturally to software-defined radio. For other tasks of a cognitive kind, the cognitive radio looks to signal-processing and machine-learning procedures for their implementation. The cognitive process, in accordance with embodiments of the invention, starts with the passive sensing of RF stimuli and culminates with action.

Cognitive radio may thus involve the following three on-line cognitive tasks. The following list includes some of the primary cognitive tasks associated with cognitive radio, but is in no way intended to be exhaustive:

-   (i) operating environment or radio-scene analysis, which encompasses estimation of interference, illustratively as an interference temperature, of the radio environment and detection of spectrum holes;
-   (ii) channel identification, which encompasses estimation of channel-state information (CSI), and prediction of channel capacity for use by a transmitter; and
-   (iii) transmit-power control and dynamic spectrum management.

These three tasks form a cognitive cycle, which is pictured in one basic form in the block diagram of FIG. 1. Through interaction with the RF environment 10, tasks (i) and (ii), shown at 18 and 19 in FIG. 1, would typically be carried out in a receiver 14, whereas task (iii), shown in FIG. 1 at 16, is carried out in a transmitter 12.

The cognitive cycle shown in FIG. 1 pertains to a one-way communication path, with the transmitter 12 and the receiver 14 located in two different places. In a two-way communication scenario, both a receiver and a transmitter or alternatively a transceiver (i.e., a combination of transmitter and receiver) would be provided at communication equipment at each end of the communication path. All of the cognitive functions embodied in the cognitive cycle of FIG. 1 are then supported at each of a wireless communication device and a base station, for example.

From this brief discussion, it is apparent that a cognitive module in the transmitter 12 preferably works in a harmonious manner with the cognitive modules in the receiver 14. In order to maintain this harmony between the cognitive radio's transmitter 12 and receiver 14 at all times, a feedback channel connecting the receiver 14 to the transmitter 12 may be provided. Through the feedback channel, the receiver 14 is enabled to convey information on the performance of the forward link to the transmitter 12. The cognitive radio, in one implementation, is therefore an example of a feedback communication system.

One other comment is in order. A broadly-defined cognitive radio technology accommodates a scale of differing degrees of cognition. At one end of the scale, the user may simply pick a spectrum hole and build its cognitive cycle around that hole. At the other end of the scale, the user may employ multiple implementation technologies to build its cognitive cycle around a wideband spectrum hole or set of narrowband spectrum holes to provide the best expected performance in terms of spectrum management and transmit-power control, and do so in the most highly secure manner possible.

From a historical perspective, the development of cognitive radio is still at a conceptual stage, unlike conventional radio. Nevertheless, cognitive radio may have the potential for making a significant difference to the way in which the radio spectrum can be accessed with improved utilization of the spectrum as a primary objective. Indeed, given its potential, cognitive radio can be justifiably described as a disruptive, but unobtrusive technology.

Embodiments of the present invention relate to signal-processing and adaptive procedures which lie at the heart of cognitive radio. In particular, the present application discloses transmit power control techniques. Radio scene analysis in cognitive radio is described in detail in International (PCT) Patent Application Serial No. PCT/CA2005/001562, entitled “OPERATING ENVIRONMENT ANALYSIS TECHNIQUES FOR WIRELESS COMMUNICATION SYSTEMS”, filed on Oct. 13, 2005, the entire contents of which are incorporated herein by reference, and in the above-referenced U.S. Provisional Patent Application Ser. No. 60/617,638.

In the following description, multi-user cognitive radio networks are considered, with a review of stochastic games highlighting the processes of cooperation and competition that characterize multi-user networks. An iterative water-filling procedure for distributed transmit power control is then proposed. Dynamic spectrum management, which may be performed hand-in-hand with transmit power control, is also discussed.

In conventional wireless communications built around base stations, transmit power levels are controlled by the base stations so as to provide a required coverage area and thereby provide desired receiver performance. On the other hand, it may be necessary for a cognitive radio to operate in a decentralized manner, thereby broadening the scope of its applications. In such a case, some alternative control mechanism for transmit power may be desirable. One key issue to be considered is how transmit power control can be achieved at the transmitter.

A partial answer to this fundamental question lies in building cooperative mechanisms into the way in which multiple access by users to the cognitive radio channel is accomplished. The cooperative mechanisms may include, for example, any of the following:

-   (i) Etiquette and protocol. Such provisions may be likened to the use of traffic lights, stop signs, and speed limits, which are intended for motorists (using a highly dense transportation system of roads and highways) for their individual safety and benefits.
-   (ii) Cooperative ad-hoc networks. In such networks, the users communicate with each other without any fixed infrastructure.

In T. J. Shepard, “Decentralized Channel Management in Scalable Multihop Spread-Spectrum Packet Radio Networks”, Ph.D. Thesis, MIT, July 1995, Shepard studies a large packet radio network using spread-spectrum modulation and a cooperative mechanism of type (ii). The only required form of coordination in the network is pairwise, between neighboring nodes (users) that are in direct communication. To mitigate interference, it is proposed that each node create a transmit-receive schedule. The schedule is communicated to a nearest neighbor only when a source node's schedule and that of the neighboring node permit the source node to transmit it and the neighboring node to receive it. Under some reasonable assumptions, simulations are presented to show that with this completely decentralized control, the network can scale to almost arbitrary numbers of nodes.

An independent and like-minded study (P. Gupta and P. R. Kumar, “The Capacity of Wireless Networks”, IEEE Trans. Information Theory, Vol. 46, Issue 2, pp. 388-404, 2000) considered a radio network consisting of n identical nodes that communicate with each other and also use a cooperative mechanism of the second type. The nodes are arbitrarily located inside a disk of unit area. A data packet produced by a source node is transmitted to a sink node (i.e., destination) via a series of hops across intermediate nodes in the network. If one bit-meter denotes one bit of information transmitted across a distance of one meter toward its destination, then the transport capacity of the network is defined as the total number of bit-meters that the network can transport in one second for all n nodes. Under a protocol model of noninterference, Gupta and Kumar derive two significant results. First, the transport capacity of the network increases with n. Second, for a node communicating with another node at a distance nonvanishingly far away, the throughput (in bits per second) decreases with increasing n. These results are consistent with those of Shepard. However, Gupta and Kumar do not consider the congestion problem identified in Shepard's work.
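As an illustrative restatement of the bit-meter definition (the symbols λ_l, d_l, W and A are notation assumed here and do not appear in the description above), if flow l carries λ_l bits per second over a source-to-destination distance of d_l meters, the network transports

$\sum\limits_{l} \lambda_{l}\, d_{l}$

bit-meters per second, and the transport capacity is the largest such sum that the network can sustain. In the Gupta and Kumar paper, for nodes that are optimally placed in a region of area A and that each transmit at W bits per second under the protocol model, this quantity is reported to scale as

$\Theta\left( W\sqrt{A\, n} \right)$

so that an equal share per node would scale as $\Theta\left( W\sqrt{A}/\sqrt{n} \right)$ bit-meters per second. This is in keeping with the two results summarized above: the aggregate grows with n while the share available to each node shrinks.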

Through the cooperative mechanisms described under (i) and (ii) and other cooperative means, the users of cognitive radio may be able to benefit from cooperation with each other in that the system could end up being able to support more users because of the potential for an improved spectrum-management strategy.

The cooperative ad-hoc networks studied by Shepard and by Gupta and Kumar are examples of a new generation of wireless networks, which, in a loose sense, resemble the Internet. In any event, in cognitive radio environments built around ad-hoc networks and existing infrastructured networks, it is possible to find the multi-user communication process being complicated by another phenomenon, namely competition, which works in opposition to cooperation.

Basically, the driving force behind competition in a multi-user environment lies in having to operate under the umbrella of limitations imposed on available network resources. Given such an environment, a particular user may try to exploit the cognitive radio channel for self-enrichment in one form or another, which, in turn, may prompt other users to do likewise. However, exploitation via competition should not be confused with the self-orientation of cognitive radio which involves the assignment of priority to certain stimuli (e.g., urgent requirements or needs). In any event, the control of transmit power in a multi-user cognitive radio environment may operate under two stringent limitations on network resources, specifically an interference or interference-temperature limit which might be imposed by regulatory agencies or other entities, and the availability of a limited number of spectrum “holes” depending on usage.

What is described above is a multi-user communication-theoretic problem. Unfortunately, a complete understanding of multi-user communication theory is yet to be developed. Nevertheless, we know enough about two diverse disciplines, namely, information theory and game theory, for us to tackle this difficult problem in a meaningful way. However, before proceeding further, we digress briefly to introduce some basic concepts in game theory.

The transmit-power control problem in a cognitive radio environment involving multiple users may be viewed as a game-theoretic problem.

In the absence of competition, we would then have an entirely cooperative game, in which case the problem simplifies to an optimal control-theoretic problem. This simplification is achieved by finding a single cost function that is optimized by all the players, thereby eliminating the game-theoretic aspects of the problem. So, the issue of interest is how to deal with a non-cooperative game involving multiple players. To formulate a mathematical framework for such an environment, three basic realities should be accounted for:

(i) a state space that is the product of the individual players' states;

(ii) state transitions that are functions of joint actions taken by the players; and

(iii) payoffs to individual players that depend on joint actions as well.

That framework is found in stochastic games, which also occasionally appear under the name “Markov games” in computer science literature.

A stochastic game is described by the five-tuple {N,S,Ā,P, R}, where

N is a set of players, indexed 1, 2, . . . , n;

S is a set of possible states;

Ā is a joint-action space defined by the product set A₁×A₂× . . .×A_(n), where A_(j) is the set of actions available to the jth player;

P is a probabilistic transition function, an element of which for joint action a satisfies the condition

${\sum\limits_{s^{\prime} \in S}p_{{ss}^{\prime}}^{a}} = 1$

for all s∈S and a∈Ā; and

R=r₁×r₂× . . . ×r_(n), where r_(j) is the payoff for the jth player and which is a function of the joint actions of all n players.

One other notational issue: the action of player j∈N is denoted by a_(j), while the joint actions of the other n−1 players in the set N are denoted by a_(−j). Similar notation is used herein for some other variables.

Stochastic games are supersets of two kinds of decision processes, namely Markov decision processes and matrix games, as illustrated in FIG. 2. A Markov decision process as shown at 20 is a special case of a stochastic game 24 with a single player, that is, n=1. On the other hand, a matrix game as shown at 22 is a special case of a stochastic game 24 with a single state, that is, |S|=1.

With two or more players, often referred to as agents in machine learning literature, being an integral part of a game, it is natural for the study of cognitive radio to be motivated by certain ideas in game theory. Prominent among those ideas for finite games (i.e., stochastic games for which each player has only a finite number of alternative courses of action) is that of a Nash equilibrium, so named for the Nobel Laureate John Nash.

A Nash equilibrium is defined as an action profile (i.e., a vector of players' actions) in which each action is a best response to the actions of all the other players. According to this definition, a Nash equilibrium is a stable operating or equilibrium point in the sense that there is no incentive for any player involved in a finite game to change strategy given that all the other players continue to follow the equilibrium policy. The important point to note here is that the Nash-equilibrium approach provides a powerful tool for modeling nonstationary processes. Simply put, it has had an enormous influence on the evolution of game theory by shifting its emphasis toward the study of equilibria as a predictive concept.

With the learning process modeled as a repeated stochastic game (i.e., repeated version of a one-shot game), each player gets to know the past behavior of the other players, which may influence the current decision to be made. In such a game, the task of a player is to select the best mixed strategy, given information on the mixed strategies of all other players in the game. Hereafter, other players are referred to primarily as “opponents”. A mixed strategy is defined as a continuous randomization by a player of its own actions, in which the actions (i.e., pure strategies) are selected in a deterministic manner. Stated in another way, the mixed strategy of a player is a random variable whose values are the pure strategies of that player.

To explain what we mean by a mixed strategy, let a_(j,k) denote the kth action of player j with k=1, 2, . . . , K. The mixed strategy of player j, denoted by the set of probabilities {p_(j,k)}_(k=1) ^(K), is an integral part of the linear combination

$\begin{matrix}\begin{matrix}{q_{j} = {\sum\limits_{k = 1}^{K}{p_{j,k}a_{j,k}}}} & {{{{for}\mspace{14mu} j} = 1},2,\ldots \mspace{14mu},{n.}}\end{matrix} & (1)\end{matrix}$

Equivalently, we may express q_(j) as the inner product

q _(j) =p _(j) ^(T) a _(j)  (2)

for j=1, 2, . . . , n

where

p_(j)=[p_(j,1), p_(j,2), . . . , p_(j,K)]^(T) is the mixed strategy vector;

a_(j)=[a_(j,1), a_(j,2), . . . , a_(j,K)]^(T) is the deterministic action vector; and

the superscript T denotes matrix transposition.

For all j, the elements of the mixed strategy vector p_(j) satisfy two conditions:

0≦p _(j,k)≦1  (3)

and

$\begin{matrix}{{\sum\limits_{k = 1}^{K}p_{j,k}} = 1.} & (4)\end{matrix}$

Note also that the mixed strategies for the different players are statistically independent.
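As an illustration only, conditions (3) and (4) and the inner product of equation (2) can be sketched in Python. The function names and the three-level action set below are hypothetical and are not taken from the embodiments described herein.

```python
import random

def is_mixed_strategy(p, tol=1e-9):
    """Check conditions (3) and (4): every p_k lies in [0, 1] and the p_k sum to 1."""
    return all(0.0 <= pk <= 1.0 for pk in p) and abs(sum(p) - 1.0) <= tol

def expected_action(p, a):
    """Inner product q_j = p_j^T a_j of equation (2)."""
    if len(p) != len(a):
        raise ValueError("p and a must have the same length K")
    return sum(pk * ak for pk, ak in zip(p, a))

def sample_action(p, a, rng=random):
    """Draw one pure strategy a_{j,k} with probability p_{j,k}."""
    return rng.choices(a, weights=p, k=1)[0]

# Hypothetical example: K = 3 transmit power levels in arbitrary units.
a_j = [0.0, 0.5, 1.0]
p_j = [0.2, 0.3, 0.5]
q_j = expected_action(p_j, a_j)  # 0.2*0.0 + 0.3*0.5 + 0.5*1.0
```

Here `q_j` is the average action implied by the mixed strategy, whereas `sample_action` returns the single pure strategy actually played in one round.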

The motivation for permitting the use of mixed strategies is the well-known fact that every stochastic game has at least one Nash equilibrium in the space of mixed strategies but not necessarily in the space of pure strategies, hence the preferred use of mixed strategies over pure strategies. The purpose of a learning algorithm is that of computing a mixed strategy, namely a sequence {q⁽¹⁾, q⁽²⁾, . . . , q^((t))} over time t.

It is also noteworthy that the implication of (1) through (4) is that the entire set of mixed strategies lies inside a convex simplex or convex hull, whose dimension is K−1 and whose K vertices are the a_(j,k). Such a geometric configuration makes the selection of the best mixed strategy in a multiple-player environment a more difficult proposition to tackle than the selection of the best base action in a single-player environment.

The formulation of Nash equilibrium assumes that the players are rational, which means that each player has a “view of the world”. Mutual knowledge of rationality and common knowledge of beliefs may be sufficient for deductive justification of the Nash equilibrium. Belief refers to state of the world, expressed as a set of probability distributions over tests, “tests” meaning a sequence of actions and observations that are executed at a specific time.

Despite the insightful value of the above proposed justification, the notion of the Nash equilibrium has two practical limitations:

(i) The approach advocates the use of a best-response strategy (i.e., a strategy whose outcome against an opponent with a similar goal is the best possible one). But in a two-player game for example, if one player adopts a non-equilibrium strategy, then the optimal response of the other player is also of a non-equilibrium kind. In such situations, the Nash-equilibrium approach is no longer applicable.

(ii) Description of a non-cooperative game is essentially confined to an equilibrium condition. Unfortunately, the approach does not teach about the underlying dynamics involved in establishing that equilibrium.

To refine the Nash equilibrium theory, we may embed learning models in the formulation of game-theoretic algorithms. This new approach provides a foundation for equilibrium theory, in which less than fully rational players strive for some form of optimality over time.

Statistical learning theory is a well-developed discipline for dealing with uncertainty, which makes it well-suited for solving game-theoretic problems. In this context, a class of no-regret algorithms is attracting a great deal of attention in the machine-learning literature.

The provision of “no-regret” is motivated by the desire to ensure two practical end-results:

(i) A player does not get unlucky in an arbitrary nonstationary environment. Even if the environment is not adversarial, the player could experience bad performance when using an algorithm that assumes independent and identically distributed (i.i.d.) examples. The no-regret provision guarantees that such a situation does not arise.

(ii) Clever opponents of that player do not exploit dynamic changes or limited resources for their own selfish benefits.

The notion of regret can be defined in different ways. In a unified treatment of game-theoretic learning algorithms, Greenwald (A. Greenwald, “Game-Theoretic Learning”, Tutorial Notes presented at the International Conference on Machine Learning, Banff, Alberta, July 2004) identifies three regret variations, including external regret, internal regret, and swap regret. One particular definition of no regret is basically a rephrasing of boosting, which coincides with external regret as proposed by Greenwald. The original formulation of boosting is due to Freund and Schapire in Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting”, Journal of Computer and System Sciences, volume 55, pp. 119-139, 1997. Basically, boosting refers to the training of a committee machine in which several experts are trained on data sets with entirely different distributions. It is a general method that can be used to improve the performance of any learning model. Stated in another way, boosting provides a method for modifying the underlying distribution of examples in such a way that a strong learning model is built around a set of weak learning modules.

To see how boosting can also be viewed as a no-regret proposition, consider a prediction problem with x̄₁, x̄₂, . . . , x̄_(t-1) denoting a sequence of input vectors. Let x̂_(t) denote the one-step prediction at time t computed by the boosting algorithm operating on this sequence. The prediction error is defined by the difference ē_(t)=x̄_(t)−x̂_(t). Let l(ē_(t)) denote a convex cost function of the prediction error ē_(t). The mean-square error is an example of such a cost function. After processing N examples, the resulting cost function of the boosting algorithm is given by

$\begin{matrix}{L_{N} = {\sum\limits_{t = 1}^{N}{{l( e_{t} )}.}}} & (5)\end{matrix}$

If, however, the prediction were to be performed by one of the experts using some fixed hypothesis h to yield the prediction error ē_(t)(h), then the corresponding cost function would have the value

$\begin{matrix}{{L_{N}(h)} = {\sum\limits_{t = 1}^{N}{{l( {e_{t}(h)} )}.}}} & (6)\end{matrix}$

The regret for not having used hypothesis h is the difference

$\begin{matrix}\begin{matrix}{{\rho_{N}(h)} = {L_{N} - {L_{N}(h)}}} \\{= {{\sum\limits_{t = 1}^{N}{l( e_{t} )}} - {{l( {e_{t}(h)} )}.}}}\end{matrix} & (7)\end{matrix}$

We say that the regret is negative if the difference ρ_(N)(h) is negative. Let H denote the class of all hypotheses used in the algorithm. Then the overall regret for not having used the best hypothesis h∈H is given by the supremum

$\begin{matrix}{\rho_{N} = {\sup\limits_{h \in H}{{\rho_{N}(h)}.}}} & (8)\end{matrix}$

A boosting algorithm is synonymous with no-regret algorithms because the overall regret ρ_(N) is small no matter which particular sequence of input vectors is presented to the algorithm.
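To make equations (5) through (8) concrete, the following minimal Python sketch computes the overall regret for a finite set of hypotheses. It assumes scalar inputs and a squared-error cost for simplicity, whereas the text above deals with input vectors; the function names and the sample data are hypothetical.

```python
def squared_loss(e):
    """Convex cost l(e); squared error is used here as the example named in the text."""
    return e * e

def cumulative_loss(targets, predictions, loss=squared_loss):
    """L_N of equations (5) and (6): the sum of l(e_t) with e_t = x_t - x_hat_t."""
    return sum(loss(x - x_hat) for x, x_hat in zip(targets, predictions))

def overall_regret(targets, algorithm_predictions, expert_predictions):
    """Return (rho_N, per-hypothesis regrets) for a finite hypothesis class H.

    expert_predictions maps each hypothesis name h to its prediction sequence.
    """
    l_n = cumulative_loss(targets, algorithm_predictions)
    regrets = {h: l_n - cumulative_loss(targets, preds)   # equation (7)
               for h, preds in expert_predictions.items()}
    return max(regrets.values()), regrets                 # equation (8)

# Hypothetical data: three observations, one learner, two fixed experts.
x = [1.0, 2.0, 3.0]
learner = [1.0, 1.5, 2.5]
experts = {"always_2": [2.0, 2.0, 2.0], "lag_by_1": [0.0, 1.0, 2.0]}
rho, per_h = overall_regret(x, learner, experts)
```

In this example the learner's cost is 0.5, the experts' costs are 2.0 and 3.0, and the overall regret is therefore negative, meaning the learner did better than the best fixed expert.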

Unfortunately, most no-regret algorithms are designed on the premise that the hypotheses are chosen from a small, discrete set which, in turn, limits applicability of the algorithms. To overcome this limitation, the Freund-Schapire boosting (Hedge) algorithm may be expanded by considering a class of prediction problems with internal structure. Specifically, the internal structure presumes two things:

(i) The input vectors are assumed to lie on or inside an almost arbitrary convex set, so long as it is possible to perform convex optimization. For example, we could have a d-dimensional polyhedron or d-dimensional sphere, where d is the dimensionality of the input space.

(ii) The prediction rules (i.e., experts) are purposely designed to be linear.

An example scenario that has the internal structure embodied under points (i) and (ii) is that of planning in a stochastic game described by a Markov decision process, in which state-action costs are controlled by an adversarial or clever opponent after the player in question fixes its own policy. The reader is referred to H. B. McMahan, G. J. Gordon, and A. Blum, “Planning in the Presence of Cost Functions Controlled by an Adversary”, in Proceedings of the Twentieth International Conference on Machine Learning, Washington, D.C., 2003, for such an example involving a robot path-planning problem, which may be likened to a cognitive radio problem made difficult by the actions of a clever opponent.

Given such a framework, we can always make a legal prediction in an efficient manner via convex duality, which is an inherent property of convex optimization. In particular, it is always possible to choose a legal hypothesis that prevents the total regret from growing too quickly, and therefore causes the average regret to approach zero.

By exploiting this internal structure, a new learning rule referred to as the Lagrangian hedging algorithm may be derived. This new algorithm is of a gradient descent kind, which includes two steps, namely, projection and scaling. The projection step simply ensures that we always make a legal prediction. The scaling step adaptively adjusts the degree to which the algorithm operates in an aggressive or conservative manner. In particular, if the algorithm predicts poorly, then the cost function assumes a large value on the average, which in turn tends to make the predictions change slowly.

The algorithm derives its name from a combination of two points:

(i) The algorithm depends on one free parameter, namely, a convex hedging function.

(ii) The hypothesis of interest can be viewed as a Lagrange multiplier that keeps the regret from growing too fast.

To expand on the Lagrangian issue under point (ii), consider the case of a matrix game using a regret-matching algorithm. Regret-matching, embodied in the so-called generalized Blackwell condition, means that the probability distribution over actions by a player is proportional to the positive elements in the regret vector of that player. For example, in the so-called “rock-scissors-paper” game in which rock smashes scissors, scissors cut paper, and paper wraps the rock, if we currently have a vector made up as follows:

regret 2 versus rock,

regret −7 versus scissors, and

regret 1 versus paper,

then we would play rock ⅔ of the time, never play scissors, and play paper ⅓ of the time. The prediction at each step of the regret-matching algorithm is a probability distribution over actions. Ideally, we desire the no-regret property, which means that the average regret vector approaches the region where all of its elements are less than or equal to zero.
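The rock-scissors-paper example above can be reproduced with a short Python sketch of the regret-matching rule. The function name is hypothetical, integer regrets are assumed so that exact fractions can be used, and the uniform fallback for the case of no positive regret is an assumption rather than something stated in the text.

```python
from fractions import Fraction

def regret_matching(regret):
    """Play each action with probability proportional to its positive regret.

    Assumes integer regret values so the result can be an exact Fraction.
    """
    positive = [max(r, 0) for r in regret]   # negative elements are set to zero
    total = sum(positive)
    if total == 0:
        # Assumption: with no positive regret, fall back to a uniform distribution.
        return [Fraction(1, len(regret))] * len(regret)
    return [Fraction(r, total) for r in positive]

# Regret vector from the text, ordered as (rock, scissors, paper).
distribution = regret_matching([2, -7, 1])
# distribution == [Fraction(2, 3), Fraction(0, 1), Fraction(1, 3)]
```

Clipping the negative elements before normalizing is the same operation as forming the gradient vector and normalizing it into a probability distribution.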

However, at any finite time, in practice, the regret vector may still have positive elements. In such a situation, we cannot achieve the no-regret condition exactly in finite time. Rather, we apply a soft constraint by imposing a quadratic penalty function on each positive element of the regret vector. The penalty function involves the sum of two components, one being the hedging function and the other being an indicator function for the set of unnormalized hypotheses using a gradient vector. The gradient vector is itself defined as the derivative of the penalty function with respect to the regret vector, the evaluation being made at the current regret vector. It turns out that the gradient vector is just the regret vector with all negative elements set equal to zero. The desired hypothesis is obtained by normalizing this vector to form a probability distribution of actions, which yields exactly the regret-matching algorithm. In choosing the distribution of actions in the manner described herein, we enforce the constraint that the regret vector is not allowed to move upwards along the gradient. The quadratic penalty function cannot grow too quickly, which in turn means that our average gradient vector will get closer to the negative orthant, as desired.

In short, the Lagrangian hedging algorithm is a no-regret algorithm designed to handle internal structure in the set of allowable predictions. By exploiting this internal structure, tight bounds on performance and fast rates of convergence are achieved when the provision of no regret is of utmost importance.

As an alternative to game-theoretic learning exemplified by a no-regret algorithm, we may look to another approach which is rooted in information theory, namely water-filling. To be specific, consider a cognitive radio environment involving n transmitters and n receivers. The environmental model is based on two assumptions:

(i) Communication across a channel is asynchronous, in which case the communication process can be viewed as a non-cooperative game. For example, in a mesh network consisting of a mixture of ad-hoc networks and existing infrastructured networks, the communication process from a base station to users is controlled in a synchronous manner, but the multi-hop communication process across the ad-hoc network could be asynchronous and therefore non-cooperative.

(ii) A signal-to-noise ratio (SNR) gap is included in calculating the transmission rate so as to account for the gap between the performance of a practical coding-modulation scheme and the theoretical value of channel capacity. In effect, the SNR gap is large enough to assure reliable communication under operating conditions all the time.

In mathematical terms, the essence of transmit power control for such a noncooperative multi-user radio environment may be stated as follows:

-   Given a limited number of spectrum holes, select the transmit power levels of n unserviced users so as to jointly maximize their data-transmission rates, subject to the constraint that an interference or interference-temperature limit is not violated.

Spectrum holes, which provide an opportunity for an unserviced user to transfer communication signals, and interference temperature as a measure of interference, are described in detail in the above-referenced International (PCT) Patent Application Serial No. PCT/CA2005/001562, and U.S. Provisional Patent Application Ser. No. 60/617,638.

It may be tempting to suggest that the solution of this problem lies in simply increasing the transmit power level of each unserviced transmitter. However, increasing the transmit power level of any one transmitter has the undesirable effect of also increasing the level of interference to which the receivers of all the other transmitters are subjected. The conclusion to be drawn from this reality is that it is not possible to represent the overall system performance with a single index of performance. Rather, a tradeoff among the data rates of all unserviced users in some computationally tractable fashion should be considered.

Ideally, we would like to find a global solution to the constrained maximization of the joint set of data-transmission rates under study. Unfortunately, finding this global solution requires an exhaustive search through the space of all possible power allocations, in which case we find that the computational complexity needed for attaining the global solution assumes a prohibitively high level.

To overcome this computational difficulty, we use a new optimization criterion called competitive optimality, which is discussed in Chapter 4 of the doctoral dissertation of W. Yu, “Competition and Cooperation in Multi-user Communication Environments”, Doctoral Dissertation, Stanford University, 2002. In particular, Yu develops an iterative water-filling algorithm for a sub-optimum solution to the multi-user digital subscriber line (DSL) environment, viewed as a noncooperative game.

The transmit power control problem may now be restated as follows:

-   Considering a multi-user cognitive radio environment viewed as a noncooperative game, maximize the performance of each unserviced transceiver, regardless of what all the other transceivers do, but subject to the constraint that an interference limit not be violated.

This formulation of the distributed transmit power control problem leads to a solution that is of a local nature. Although sub-optimum, the solution is insightful, as described in further detail below.

Consider the simple scenario of FIG. 3, involving two users communicating across a flat-fading channel. The complex-valued baseband channel matrix is denoted by

$\begin{matrix}{H = {\begin{bmatrix}h_{11} & h_{12} \\h_{21} & h_{22}\end{bmatrix}.}} & (9)\end{matrix}$

Viewing this scenario as a non-cooperative game, we may describe the two players of the game as follows:

-   The two players are represented by transmitters 1 and 2. In the two-user example of FIG. 3, each user is represented by a single-input, single-output (SISO) wireless system, hence the adoption of transmitters 1 and 2 of the two systems as the two players in a game-theoretic interpretation of the example. In a MIMO generalization of this example, each user has multiple transmitters. Nevertheless, there are still two players, with the two players being represented by the two sets of one or more transmitters.

-   The pure strategies (i.e., deterministic actions) of the two players are defined by the power spectral densities S₁(f) and S₂(f) that respectively pertain to the transmitted signals radiated by antennas of transmitters 1 and 2.

-   The payoffs to the two players are defined by the data transmission rates R₁ and R₂, which are respectively produced by transmitters 1 and 2.

The noise floor of the radio frequency (RF) environment is characterized by a frequency-dependent parameter, the power spectral density S_(N)(f). In effect, S_(N)(f) defines the “noise floor” above which the transmit power controller must fit the transmission data requirements of users 1 and 2.

The cross-coupling between the two users in terms of two new real-valued parameters α₁ and α₂ may be defined by writing

$\begin{matrix}{\alpha_{1} = \frac{\Gamma\left| h_{12} \right|^{2}}{\left| h_{22} \right|^{2}}} & (10)\end{matrix}$

and

$\begin{matrix}{\alpha_{2} = \frac{\Gamma\left| h_{21} \right|^{2}}{\left| h_{11} \right|^{2}}} & (11)\end{matrix}$

where Γ is the signal-to-noise ratio (SNR) gap. Assuming that the receivers do not perform any form of interference cancellation irrespective of the received signal strengths, we may respectively formulate the achievable data-transmission rates R₁ and R₂ as the two definite integrals

$\begin{matrix}{R_{1} = {\int_{\text{hole 1}}{{\log_{2}\left( {1 + \frac{S_{1}(f)}{{N_{1}(f)} + {\alpha_{2}{S_{2}(f)}}}} \right)}\, df}}} & (12)\end{matrix}$

and

$\begin{matrix}{R_{2} = {\int_{\text{hole 2}}{{\log_{2}\left( {1 + \frac{S_{2}(f)}{{N_{2}(f)} + {\alpha_{1}{S_{1}(f)}}}} \right)}\, {df}.}}} & (13)\end{matrix}$

The term α₂S₂(f) in the first denominator and the term α₁S₁(f) in the second denominator are due to the cross-coupling between the transmitters and receivers. The remaining two terms N₁(f) and N₂(f) are noise terms defined by

$\begin{matrix}{{N_{1}(f)} = \frac{\Gamma \; {S_{N,1}(f)}}{{h_{11}}^{2}}} & (14)\end{matrix}$

and

$\begin{matrix}{{N_{2}(f)} = \frac{\Gamma \; {S_{N,2}(f)}}{{h_{22}}^{2}}} & (15)\end{matrix}$

where S_(N,1)(f) and S_(N,2)(f) are respectively the particular parts of the noise-floor's spectral density S_(N)(f) that define the spectral contents of spectrum holes 1 and 2.
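For illustration, the quantities in equations (10) through (15) can be evaluated numerically by replacing each integral with a sum over frequency bins. The Python sketch below assumes flat channel coefficients, equal-width bins of width `df`, and function names that are all hypothetical.

```python
import math

def cross_coupling(gamma, h_cross, h_direct):
    """alpha = Gamma * |h_cross|^2 / |h_direct|^2, as in equations (10) and (11)."""
    return gamma * abs(h_cross) ** 2 / abs(h_direct) ** 2

def normalized_noise(gamma, s_noise, h_direct):
    """N(f) = Gamma * S_N(f) / |h_direct|^2 per frequency bin, as in (14) and (15)."""
    return [gamma * s / abs(h_direct) ** 2 for s in s_noise]

def rate(s_own, noise, alpha_other, s_other, df):
    """Riemann-sum approximation of equation (12) or (13) over one spectrum hole."""
    return sum(math.log2(1.0 + so / (n + alpha_other * sx)) * df
               for so, n, sx in zip(s_own, noise, s_other))

# Hypothetical two-bin example for user 1: alpha_2 uses h_21 and h_11.
gamma = 1.0
alpha_2 = cross_coupling(gamma, h_cross=0.1, h_direct=1.0)
n_1 = normalized_noise(gamma, s_noise=[1.0, 1.0], h_direct=1.0)
r_1 = rate(s_own=[1.0, 3.0], noise=n_1, alpha_other=alpha_2,
           s_other=[0.0, 0.0], df=1.0)   # log2(2) + log2(4) when user 2 is silent
```

Raising `s_other` lowers `r_1`, which is the cross-coupling effect captured by the α₂S₂(f) term in the denominator of equation (12).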

We are now ready to formally state the competitive optimization problem as follows:

-   Given that the power spectral density S₂(f) of transmitter 2 is fixed, maximize the data transmission rate R₁ of transmitter 1, subject to the constraint

∫_(hole1) [S ₁(f)+N ₁(f)+α₂ S ₂(f)]df≦kT _(max)

where T_(max) is a prescribed interference temperature limit and k is Boltzmann's constant. A similar statement applies to the competitive optimization of transmitter 2.

Of course, it is understood that both S₁(f) and S₂(f) remain nonnegative for all f. The solution to the optimization problem described herein follows the allocation of transmit power in accordance with a water-filling procedure.
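A minimal sketch of a single-user water-filling step over discrete frequency bins is given below in Python. It assumes that `noise` holds the per-bin sum of the noise term and the cross-coupled interference, and that `budget` is the power remaining for S₁ once those terms have been subtracted from the interference-temperature limit; both names, and the function itself, are hypothetical.

```python
def water_fill(noise, budget):
    """Return powers s_k = max(mu - noise_k, 0) whose sum equals budget.

    mu is the common "water level"; bins whose noise exceeds mu receive no power.
    """
    if budget <= 0:
        return [0.0] * len(noise)
    levels = sorted(noise)
    mu = 0.0
    for m in range(len(levels), 0, -1):
        # Water level if only the m quietest bins are active.
        mu = (budget + sum(levels[:m])) / m
        if mu > levels[m - 1]:
            break   # all m bins sit below the water level, so mu is valid
    return [max(mu - n, 0.0) for n in noise]

# Hypothetical example: the noisiest bin is left empty.
allocation = water_fill([1.0, 2.0, 5.0], budget=3.0)   # water level mu = 3.0
```

In this example the first two bins receive 2.0 and 1.0 units of power respectively, so that power plus noise is the same in every active bin, and the third bin receives nothing because its noise level lies above the water level.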

FIG. 4 is a plot of results of an illustrative experiment on a two-user wireless communication scenario, which were obtained using the water-filling procedure. The results presented in FIG. 4 were generated under the following conditions, although of course it should be appreciated that these results are intended solely for illustrative purposes and that similar or different results may be exhibited under different test or actual conditions:

-   Narrowband channels uniformly spaced in frequency are available to the 2 users, as follows: user 1, channels 1, 2, and 3; user 2, channels 4, 5, and 6.

-   Modulation strategy is OFDM.

-   Multi-user path-loss matrix

$\begin{bmatrix}0.5207 & 0 & 0 & 0.0035 & 0.0020 & 0.0024 \\0 & 0.5223 & 0 & 0.0030 & 0.0034 & 0.0031 \\0 & 0 & 0.5364 & 0.0040 & 0.0015 & 0.0035 \\0.0036 & 0.0002 & 0.0023 & 0.7136 & 0 & 0 \\0.0028 & 0.0029 & 0.0011 & 0 & 0.6945 & 0 \\0.0022 & 0.0010 & 0.0034 & 0 & 0 & 0.7312\end{bmatrix}.$

-   Target data transmission rates for users 1 and 2 are 9 and 12 bits/symbol, respectively.

-   Power constraint imposed by interference-temperature limit was 0 dB.

-   Receiver noise-power level=−30 dB.

-   Ambient interference power level=−24 dB.

The solution presented in FIG. 4 was reached in 2 iterations of thewater-filling algorithm. Two things are illustrated in FIG. 4:

(i) the spectrum-sharing process performed using an iterative water-filling algorithm; and

(ii) the bit-loading curve at the top of the Figure.

To add meaning to the important result portrayed in FIG. 4, we may state that the optimal competitive response to the all pure-strategy corresponds to a Nash equilibrium. Stated in another way, a Nash equilibrium is reached if, and only if, both users simultaneously satisfy the water-filling condition.

An assumption implicit in the water-filling solution presented in FIG. 4 is that each transmitter of cognitive radio has knowledge of its position with respect to the receivers in its operating range at all times. In other words, cognitive radio has geographic awareness, which may be implemented by embedding a Global Positioning System (GPS) receiver in the system design, for instance.

A transmitter puts its geographic awareness to good use by calculating the path loss incurred in the course of electromagnetic propagation of the transmitted signal to each receiver in the transmitter's operating range, which in turn makes it possible to calculate the multi-user path-loss matrix of the environment. Let d denote the distance from a transmitter to a receiver. Extensive measurements of the electromagnetic field strength, expressed as a function of the distance d, carried out in various radio environments have motivated an empirical propagation formula for the path loss, which expresses the received signal power P_(R) in terms of the transmitted signal power P_(T) as

${P_{R} = {( \frac{\beta}{d^{m}} )P_{T}}},$

where the path-loss exponent m varies from 2 to 5, depending on the environment, and the attenuation parameter β is frequency-dependent.

Considering the general case of n transmitters indexed by i, and n receivers indexed by j, let h_(ij) denote the complex-valued channel coefficient from transmitter i to receiver j. Then, in light of the empirical propagation formula, we may write

${h_{ij}}^{2} = {\frac{P_{R,j}}{P_{T,i}} = \frac{\beta}{d_{ij}^{m}}}$

for i=1, 2, . . . , n and j=1, 2, . . . , n, and with d_(ij) being the distance from transmitter i to receiver j. Hence, knowing β, m, and d_(ij) for all i and j, we may calculate the multi-user path-loss matrix.
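As a sketch of this calculation, the following Python fragment builds the multi-user path-loss matrix from transmitter and receiver coordinates. The coordinates and the values of β and m are hypothetical, a single frequency-independent β is assumed for simplicity, and every transmitter is assumed to be at a nonzero distance from every receiver.

```python
import math

def distance_matrix(tx_positions, rx_positions):
    """d_ij: Euclidean distance from transmitter i to receiver j."""
    return [[math.dist(t, r) for r in rx_positions] for t in tx_positions]

def path_loss_matrix(distances, beta, m):
    """|h_ij|^2 = beta / d_ij**m for every transmitter i and receiver j."""
    return [[beta / d ** m for d in row] for row in distances]

# Hypothetical layout in meters, with beta = 1 and path-loss exponent m = 2.
tx = [(0.0, 0.0), (10.0, 0.0)]
rx = [(1.0, 0.0), (10.0, 2.0)]
gains = path_loss_matrix(distance_matrix(tx, rx), beta=1.0, m=2)
```

The diagonal entries of `gains` are the direct-link power gains, and the off-diagonal entries are the cross-coupling gains that appear in the interference terms.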

Emboldened by the water-filling solution illustrated in FIG. 4 for a two-user scenario, we may formulate an iterative two-loop water-filling algorithm for the distributed transmit power control of a multi-user radio environment. The environment involves a set of transmitters indexed by i=1, 2, . . . , n and a corresponding set of receivers indexed by j=1, 2, . . . , n. Viewing the multi-user radio environment as a non-cooperative game and assuming the availability of an adequate number of spectrum holes to accommodate target data transmission rates, an iterative water-filling algorithm may proceed as follows:

(i) Initialization. Unless some prior knowledge is available, the power distribution across the n users is set equal to zero or some other initial value.

(ii) Inner loop (iteration). Given a set of allowed channels (i.e., spectrum holes):

-   User 1 performs water-filling, subject to its power constraint. At first, the user employs one channel, but if its target rate is not satisfied, it may attempt to employ two channels, and so on. The water-filling by user 1 is performed with only the noise floor to account for.

-   User 2 performs the water-filling process, subject to its own power constraint. At this point, in addition to the noise floor, the water-filling computation for user 2 may account for interference produced by user 1.

-   The power-constrained water-filling process is continued until all n users are dealt with.

(iii) Outer loop (iteration). After the inner iteration is completed, the power allocation among the n users is adjusted:

-   If the actual data transmission rate of any user is found to be greater than its target value, the transmit power of that user is reduced.

-   If, on the other hand, the actual data transmission rate of any user is less than the target value, the transmit power is increased, keeping in mind that an interference limit, illustratively an interference temperature limit, is not violated.

(iv) Confirmation. After the power adjustments, up or down, are completed, the data transmission rates of all the n users are checked:

-   If the target rates of all the n users are satisfied, the computation is terminated.

-   Otherwise, the algorithm goes back to the inner loop, and the computations are repeated. This time, however, the water-filling performed by every user, including user 1, preferably accounts for the interference produced by all the other users.
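The two-loop procedure above can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not part of the description: all users share the same channels, gains are flat across channels, `max_budget` stands in for the interference limit, each user's power budget is adjusted by a relative step that is halved whenever the adjustment changes direction, the rate check is made before each adjustment, and a target counts as satisfied when the rate is within `tol` of it. The channel-by-channel build-up in step (ii) is omitted, and all names are hypothetical.

```python
import math

def water_fill(noise, budget):
    """Powers s_k = max(mu - noise_k, 0) whose sum equals budget."""
    if budget <= 0:
        return [0.0] * len(noise)
    levels = sorted(noise)
    mu = 0.0
    for m in range(len(levels), 0, -1):
        mu = (budget + sum(levels[:m])) / m   # level if the m quietest channels are used
        if mu > levels[m - 1]:
            break
    return [max(mu - n, 0.0) for n in noise]

def interference(power, gain, i, k):
    """Power reaching receiver i on channel k from all other transmitters."""
    return sum(gain[j][i] * power[j][k] for j in range(len(power)) if j != i)

def user_rates(power, gain, noise):
    """Bits per symbol for each user, summed over the channels."""
    return [sum(math.log2(1.0 + gain[i][i] * power[i][k]
                          / (noise[k] + interference(power, gain, i, k)))
                for k in range(len(noise)))
            for i in range(len(power))]

def iterative_water_filling(gain, noise, targets, max_budget,
                            tol=0.05, max_outer=500):
    n, n_ch = len(targets), len(noise)
    power = [[0.0] * n_ch for _ in range(n)]   # (i) initialization to zero power
    budget = [0.5 * max_budget] * n            # assumed starting power budget
    step = [0.2] * n                           # assumed relative adjustment size
    last_dir = [0] * n
    rates = [0.0] * n
    for _ in range(max_outer):
        # (ii) inner loop: users water-fill in turn against noise plus interference
        for i in range(n):
            eff = [(noise[k] + interference(power, gain, i, k)) / gain[i][i]
                   for k in range(n_ch)]
            power[i] = water_fill(eff, budget[i])
        # (iv) confirmation: stop once every rate is within tol of its target
        rates = user_rates(power, gain, noise)
        if all(abs(r - t) <= tol for r, t in zip(rates, targets)):
            return power, rates, True
        # (iii) outer loop: lower the budget if above target, raise it if below
        for i in range(n):
            direction = -1 if rates[i] > targets[i] else 1
            if last_dir[i] and direction != last_dir[i]:
                step[i] *= 0.5                 # damp oscillation around the target
            last_dir[i] = direction
            budget[i] = min(budget[i] * (1 + direction * step[i]), max_budget)
    return power, rates, False

# Hypothetical two-user, two-channel scenario with weak cross-coupling.
gain = [[1.0, 0.01], [0.01, 1.0]]
power, rates, converged = iterative_water_filling(
    gain, noise=[0.01, 0.01], targets=[6.0, 6.0], max_budget=1.0)
```

On the first pass user 1 sees only the noise floor because all other powers are still zero, while on later passes every user accounts for the interference from all the others. If the targets lie outside the attainable region, the budgets saturate at `max_budget` and the function returns with its third value set to `False`.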

In effect, the outer loop of the distributed transmit power controller tries to find the minimum level of transmit power needed to satisfy the target data transmission rates of all n users.

For the distributed transmit power controller to function properly, two requirements are preferably satisfied:

-   Each user knows, a priori, its own target rate.

-   All target rates lie within a permissible rate region. Otherwise, some or all of the users will violate the interference limit.

To distributively lie within the permissible rate region, the transmitter is preferably equipped with a centralized agent that has knowledge of the channel capacity (through rate-feedback from the receiver, for instance) and multi-user path-loss matrix (by virtue of geographic awareness). The centralized agent is thereby enabled to decide which particular sets of target rates are indeed attainable.

The iterative water-filling (WF) approach, rooted in communication theory, has a “top-down, dictatorially-controlled” flavor. In contrast, a no-regret algorithm, rooted in machine learning, has a “bottom-up” flavor. In more specific terms, we may make the following observations:

-   (i) The iterative WF algorithm exhibits fast-convergence behavior by virtue of incorporating information on both the channel and RF environment. On the other hand, a no-regret algorithm exemplified by the Lagrangian hedging algorithm relies on first-order gradient information, causing it to converge comparatively slowly.
-   (ii) The Lagrangian hedging learner has the attractive feature of incorporating a regret agenda, the purpose of which is to guarantee that the learner cannot be deceptively exploited by a clever player. On the other hand, the iterative WF algorithm lacks a learning strategy that could enable it to guard against exploitation.

In short, the iterative water-filling approach has much to offer for dealing with multi-user scenarios, but its performance could be improved through interfacing with a more competitive, regret-conscious learning-machine that enables it to mitigate the exploitation phenomenon.

Transmit power control techniques have been described in substantial detail above. FIG. 5 is a flow diagram of a method according to an embodiment of the invention, in which power control techniques rooted in vastly different technical fields are combined into one method of controlling transmit power in a multi-user wireless communication system.

As shown, the method 50 begins at 52 with an operation of determining a transmit power level for a transmitter. This determination is made according to a communication theory algorithm, one example of which is an iterative water-filling procedure. The communication theory algorithm used at 52 is based on an assumption of transmit behaviours of transmitters in the communication system.

At 54, the method 50 continues with an operation of monitoring transmit behaviours of the transmitters in the communication system. Whereas the determining operation at 52 uses a communication theory algorithm, the monitoring operation at 54 uses a learning algorithm.

In one embodiment, the communication theory algorithm is an iterative water-filling procedure which accounts for determined transmit power levels of the transmitter and other transmitters in the communication system.

An example of such an iterative procedure has been described above, and may begin with initializing a transmit power distribution across n transmitters. Water-filling is then performed for the transmitter to determine a transmit power level for a target data transmission rate of the transmitter subject to a power constraint for the transmitter and a level of interference, the level of interference comprising a noise floor plus either initialized transmit power levels or previously determined transmit power levels for the other transmitters. A determination is then made as to whether a data transmission rate of the transmitter is greater than or less than a target data transmission rate of the transmitter, and if so, the determined transmit power level for the transmitter is adjusted. A similar determination is then made for other transmitters, to determine whether a target data transmission rate of at least one of the n transmitters is not satisfied by a respective adjusted transmit power level for the at least one transmitter. The water-filling is repeated where the target data transmission rate of at least one of the n transmitters is not satisfied.

The operations of performing, determining whether a data transmission rate of a transmitter is greater than or less than a target transmission rate, and adjusting may be repeated for each of the other transmitters.

The operation of determining whether the target data transmission rate of at least one of the n transmitters is not satisfied may involve determining whether the target data transmission rates of all of the n transmitters are not satisfied. In this case, the water-filling is repeated where the target data transmission rates of all of the n transmitters are not satisfied. A target data transmission rate may be considered not satisfied if an actual or attainable data transmission rate differs from the target data transmission rate by a predetermined amount.

Adjusting a transmit power level may involve reducing the determined transmit power level for a transmitter where the data transmission rate of the transmitter is greater than the target data transmission rate of the transmitter. Where the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter, adjusting may involve determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit, and if not, increasing the determined transmit power level of the transmitter.
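
The adjustment rule just described might be sketched as follows. The function name, the multiplicative `step`, and the simple interference model (existing `background` interference plus the candidate power scaled by a hypothetical `gain_to_victim`) are assumptions made for illustration; the description above does not fix how the interference level is computed.

```python
def adjust_power(power, rate, target, gain_to_victim, background, limit, step=0.1):
    """Return an adjusted transmit power level for one transmitter."""
    if rate > target:
        # Rate exceeds the target: back the power off.
        return power * (1.0 - step)
    if rate < target:
        proposed = power * (1.0 + step)
        # Raise the power only if the interference level limit still holds
        # (assumed model: background interference plus scaled transmit power).
        if background + gain_to_victim * proposed <= limit:
            return proposed
    # Target met exactly, or an increase would violate the limit.
    return power
```

For example, `adjust_power(1.0, 0.5, 1.0, gain_to_victim=0.5, background=0.1, limit=1.0)` proposes a power of 1.1, finds the resulting interference of 0.65 below the limit, and accepts the increase; with `limit=0.6` the power is left unchanged.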

The method 50 continues at 56 with an operation of determining whether the behaviours of the transmitters are consistent with the assumption of behaviours on which the transmit power level determination algorithm is based. Any “misbehaving” transmitters can impact the effectiveness of the transmit power level determination algorithm, and thus the operation of other transmitters in the communication system. Since the transmitters are competing for the same limited resource, a particular transmitter should not be allowed to exploit the resource for only its own benefit if this also negatively affects other transmitters.

One possible action which may be taken when the transmit behaviour of a transmitter is not consistent with the behaviour assumption of the transmit power level determination algorithm would be to generate an alert. Such an alert could be raised locally, at a communication device where the method 50 is being performed, and/or sent to a remote device or system.

Corrective action may also be taken, as shown at 58. This may include actions affecting the transmit power level determination algorithm at 52 and/or actions taken by a communication network operator or service provider. A possible network operator or service provider corrective action might be to increase communication service charges for a user of a transmitter which does not abide by the transmit behaviour assumption(s) upon which the transmit power level determination algorithm is based.

As noted above, transmit power level determination in some embodiments involves detecting at least one spectrum hole and determining a transmit power level for the transmitter for transmission within the at least one spectrum hole.

It should be appreciated that the method 50 is intended solely for the purposes of illustration. Embodiments of the invention may involve further, fewer, or different operations which may be performed in a similar or different order than explicitly shown.

For example, in some embodiments, a transmit power control method also includes an operation of predicting subsequent availability of a detected spectrum hole, and the operations of determining and monitoring at 52, 54 are repeated when the at least one spectrum hole is predicted to become unavailable.

Also in the context of spectrum holes, one or more further spectrum holes may be detected. The operations of determining and monitoring may then be repeated to control transmit power levels for transmitters within the further spectrum hole(s), such as when an increase in interference in a current spectrum hole is detected.

Transmit power control may also involve determining a position of the transmitter relative to other transmitters in the communication system. A multi-user path loss matrix of the operating environment of the transmitter may then be calculated based on the determined position of the transmitter relative to the other transmitters.
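
One way such a matrix could be computed from positions is sketched below. The power-law attenuation model, the path-loss `exponent`, and the `ref_distance` are assumptions chosen only for illustration; the description does not prescribe a particular propagation model.

```python
import math

def path_loss_matrix(positions, exponent=3.0, ref_distance=1.0):
    """Pairwise power gains between transmitters at the given (x, y) positions.

    Assumed model: gain = (ref_distance / d) ** exponent, with distances
    shorter than ref_distance clamped to ref_distance.
    """
    n = len(positions)
    gains = [[1.0] * n for _ in range(n)]  # diagonal: unit gain to self
    for i in range(n):
        for j in range(n):
            if i != j:
                d = max(math.dist(positions[i], positions[j]), ref_distance)
                gains[i][j] = (ref_distance / d) ** exponent
    return gains
```

With two transmitters 2 units apart and the default exponent of 3, each off-diagonal entry is (1/2)^3 = 0.125. A matrix of this kind could serve as the cross-coupling input to a water-filling computation.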

In terms of a system for controlling transmit power in a multi-user wireless communication system, FIG. 6 is a block diagram of communication equipment in which embodiments of the invention may be implemented. The communication equipment 60 includes a transceiver 64 and one or more antennas 62 for receiving communication signals (including stimuli from interferers) from and transmitting communication signals to other communication equipment. Multiple antennas 62 are provided, for example, in Multiple-Input Multiple-Output (MIMO) communication equipment. The communication equipment 60 also includes a processor 66 connected to the transceiver 64 and a memory 68.

Many different types of transceiver 64 and antennas 62 will be apparent to those skilled in the art. The particular types of the transceiver 64 and to some extent the antennas 62 are dependent upon, for example, the type of the communication equipment 60 and/or the communication system in which it is intended to operate. The invention is in no way limited to any particular type of transceiver 64 or antennas 62. Embodiments of the invention may be implemented, for example, in mobile communication devices, base stations, and/or other equipment in a wireless communication system.

The processor 66 may include or be implemented as a microprocessor or a digital signal processor, for example, which is configurable to perform any or all of the functions disclosed herein by executing software stored in the memory 68. Other functions may also be performed by the processor 66, such that the processor 66 is not necessarily a dedicated processor. The specific implementation of the processor 66 and the memory 68, or other functional elements used in further embodiments of the invention, may also be dependent to some extent on the type of the communication equipment 60 and/or the communication system in which it is intended to operate.

In a mobile communication device, for example, the memory 68 would typically include a solid state memory device, although other types of memory device may also or instead be provided in the communication equipment 60.

In operation, the processor 66 is configured, by software stored in the memory 68, to determine a transmit power level for a transmitter in the transceiver 64 according to an algorithm based in communication theory, illustratively an iterative water-filling procedure, and to monitor transmit behaviours of other transmitters using a learning algorithm.

Information associated with transmit behaviours of the other transmitters may be received by the processor 66 through a receiver in the transceiver 64 and the antenna(s) 62. More generally, the processor 66 has an input which receives transmit behaviour information associated with other transmitters in a communication system.

The processor 66 may perform these operations substantially as described above. Other operations may also be performed. For example, the processor 66 may participate directly or indirectly in corrective actions that affect “misbehaving” transmitters. Indirect participation may involve detecting transmit behaviours that are not consistent with assumed conditions under which a transmit power control algorithm operates most effectively and alerting another device or system to this detection. Direct participation would involve a corrective action being performed by the processor 66 itself.

Communication equipment may also or instead include a GPS receiver, which would allow the processor 66 to determine a position of its transmitter based on signals received by the GPS receiver.

It should be appreciated that the present invention is in no way limited to the particular operations or system components explicitly shown in FIGS. 5 and 6. Embodiments of the invention may include further or fewer operations or components which are performed or connected differently than shown in the drawings. For example, the techniques disclosed herein may be applied to communication equipment in which only a receiver, a transmitter, or a single antenna or sensor is provided. The various functions disclosed herein may also be implemented using separate hardware, software, and/or firmware components, and need not be performed by a single module such as the processor shown in FIG. 6. Other implementations of embodiments of the invention, as instructions stored on a machine-readable medium, for example, are also contemplated.

It should also be appreciated that use of the term “user” herein is not intended to imply that the present invention is restricted to use in conjunction with only end user communication equipment. The techniques disclosed herein could be employed at end user handsets, communication system base stations or other network equipment, or both.

In addition, those skilled in the art will note that the term “user” has in some cases been used to refer to communication equipment of a user as opposed to an actual user of that equipment, in the context of a multi-user communication system or determining transmit power levels for users.

References to “users” should be interpreted accordingly.

Dynamic spectrum management, also commonly referred to as dynamic frequency allocation, is a process which could be performed in a transmitter. Transmit power control as described in detail above is also performed in a transmitter. These two tasks are so intimately related to each other that both may be included in a single functional module which performs the role of multiple-access control in the basic cognitive cycle of FIG. 1.

Simply put, one primary purpose of spectrum management is to develop an adaptive strategy for the efficient and effective utilization of the RF spectrum. Specifically, a spectrum management algorithm may build on spectrum holes detected during radio-scene analysis, as described in the co-pending application incorporated above, and the output of a transmit power controller to select communication parameters such as a modulation strategy that adapt to the time-varying conditions of the radio environment, all the time assuring reliable communication across the channel. Communication reliability may be assured, for example, by choosing the SNR gap Γ large enough a priori, as discussed above.

A modulation strategy that lends itself to cognitive radio is OFDM, by virtue of its flexibility and computational efficiency. For its operation, OFDM uses a set of carrier frequencies centered on a corresponding set of narrow channel bandwidths. The availability of rate feedback (through the use of a feedback channel) permits the use of bit-loading, whereby the number of bits/symbol for each channel is optimized for the signal-to-noise ratio characterizing that channel. This operation is illustrated by the uppermost curve in FIG. 4.
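
As a rough illustration of bit-loading, the sketch below assigns each channel a whole number of bits/symbol from its signal-to-noise ratio. The gap-approximation formula used here, the function name, and the `max_bits` cap are assumptions made for the example; linear (not dB) SNR values are assumed.

```python
import math

def bit_loading(snrs, gap=1.0, max_bits=10):
    """Bits/symbol per channel from linear SNR, using the assumed rule
    b = floor(log2(1 + SNR / gap)), capped at max_bits."""
    return [min(max_bits, math.floor(math.log2(1.0 + snr / gap)))
            for snr in snrs]
```

For instance, `bit_loading([3.0, 15.0, 0.5])` yields 2, 4 and 0 bits/symbol respectively, so the weakest channel carries no data. Choosing a larger `gap` lowers the loading on every channel in exchange for a larger reliability margin.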

As time evolves and spectrum holes come and go, the bandwidth-carrier frequency implementation of OFDM is dynamically modified, as illustrated in the time-frequency plot pictured in FIG. 7 for the case of 4 carrier frequencies. FIG. 7 illustrates a distinctive feature of cognitive radio: a dynamic spectrum sharing process, which evolves in time. In effect, the spectrum sharing process satisfies the constraint imposed on cognitive radio by the availability of spectrum holes at a particular geographic location and their possible variability with time. Throughout the spectrum-sharing process, a transmit power controller may keep an account of the bit-loading across the spectrum holes currently in use. In effect, a dynamic spectrum manager and a transmit power controller may work in concert, thereby providing multiple-access control.

Starting with a set of spectrum holes, which may be detected as described in the above-referenced International (PCT) Patent Application Serial No. PCT/CA2005/001562, and U.S. Provisional Patent Application Ser. No. 60/617,638, it is possible for a dynamic spectrum management algorithm to confront a situation where a prescribed frame-error rate cannot be satisfied. In situations of this kind, the algorithm can do one of two things:

(i) work with a more spectrally efficient modulation strategy; or else

(ii) incorporate the use of one or more other spectrum holes.

In approach (i), the algorithm resorts to increased computational complexity, and in approach (ii), it resorts to increased channel bandwidth so as to maintain communication reliability.

A dynamic spectrum management algorithm may take traffic considerations into account. In a code-division multiple access (CDMA) system like IS-95, for example, there is a phenomenon called cell breathing. Cells in the system effectively shrink and grow over time. Specifically, if a cell has more users, then the interference level tends to increase, which is counteracted by allocating a new incoming user to another cell. That is, the cell coverage is reduced. If, on the other hand, a cell has fewer users, then the interference level is correspondingly lowered, in which case the cell coverage is allowed to grow by accommodating new users. So in a CDMA system, traffic and interference levels are associated together. In a cognitive radio system based on CDMA, a dynamic spectrum management algorithm naturally focuses on the allocation of users, first to white spaces with low interference levels and then to grey spaces with higher interference levels.

When using other multiple-access techniques, such as OFDM, co-channel interference should be avoided. To achieve this goal, a dynamic-spectrum management algorithm may include a traffic model of the primary user occupying a portion of the spectrum. The traffic model, which could be built on historical data, provides a basis for predicting future traffic patterns in that portion of the spectrum, which in turn makes it possible to predict the duration for which a spectrum hole vacated by a primary user is likely to be available for use by a cognitive radio operator.

In a wireless environment, two classes of traffic data pattern are distinguished, namely deterministic patterns and stochastic patterns. In a deterministic traffic pattern, the primary user (e.g., TV transmitter, radar transmitter) is assigned a fixed time slot for transmission. When it is switched OFF, the frequency band is vacated and can therefore be used by a cognitive radio operator. Stochastic patterns, on the other hand, can only be described in statistical terms. Typically, the arrival times of data packets are modeled as a Poisson process, while the service times are modeled as uniformly distributed or Poisson distributed, depending on whether the data are packet-switched or circuit-switched, respectively. In any event, the model parameters of stochastic traffic data vary slowly, and therefore lend themselves to on-line estimation using historical data. Moreover, by building a tracking strategy into the design of the predictive model, the accuracy of the model can be further improved.
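
A minimal sketch of on-line estimation with tracking, assuming Poisson arrivals of primary-user traffic, is given below. The class name, the exponential forgetting factor, and the use of inter-arrival gaps as the observed quantity are illustrative assumptions rather than features of any described embodiment.

```python
import math

class ArrivalRateTracker:
    """Tracks a slowly varying Poisson arrival rate from observed gaps."""

    def __init__(self, initial_rate, forgetting=0.9):
        self.mean_gap = 1.0 / initial_rate  # mean time between arrivals
        self.forgetting = forgetting        # weight kept on past history

    def update(self, gap):
        # Exponentially weighted average: old data is gradually discounted,
        # which lets the estimate follow slow changes in the traffic.
        self.mean_gap = (self.forgetting * self.mean_gap
                         + (1.0 - self.forgetting) * gap)

    @property
    def rate(self):
        return 1.0 / self.mean_gap

    def prob_idle(self, duration):
        # For Poisson arrivals, P(no arrival within `duration`) = exp(-rate * duration).
        return math.exp(-self.rate * duration)
```

Starting from a rate of 1.0 with `forgetting=0.5`, a single observed gap of 3.0 moves the mean gap to 2.0 and the rate to 0.5, so the estimated probability that a vacated band stays idle for a further 2.0 time units is exp(-1), about 0.37.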

What has been described is merely illustrative of the application of the principles of the invention. Other arrangements and methods can be implemented by those skilled in the art without departing from the scope of the present invention.

For example, it may be useful to consider possible emergent behavior of cognitive radio.

The cognitive radio environment is naturally time varying. Most important, it exhibits a unique combination of characteristics including, among others, adaptivity, awareness, cooperation, competition, and exploitation. Given these characteristics, we may wonder about the emergent behavior of a cognitive radio environment in light of what we know about two relevant fields: self-organizing systems, and evolutionary games.

First, we note that the emergent behavior of a cognitive radio environment, viewed as a game, is influenced by the degree of coupling that may exist between the actions of different players (i.e., transmitters) operating in the game. The coupling may have the effect of amplifying local perturbations in a manner analogous with Hebb's postulate of learning, which accounts for self-amplification in self-organizing systems. Clearly, if they are left unchecked, the amplifications of local perturbations would ultimately lead to instability. From the study of self-organizing systems, we know that competition among the constituents of such a system can act as a stabilizing force. By the same token, we expect that competition among the users of cognitive radio for limited resources (e.g., spectrum holes) may have the influence of a stabilizer.

For additional insight, we next look to evolutionary games. The idea of evolutionary games, developed for the study of ecological biology, was first introduced by Maynard Smith in 1974. In his landmark works (J. Maynard Smith, “The Theory of Games and the Evolution of Animal Conflicts”, J. Theoretical Biology, vol. 47, pp. 209-221, 1974, and J. Maynard Smith, Evolution and the Theory of Games, Cambridge University Press, 1982), Maynard Smith wondered whether the theory of games could serve as a tool for modeling conflicts in a population of animals. In specific terms, two critical insights into the emergence of so-called evolutionary stable strategies were presented by Maynard Smith, as succinctly summarized in P. W. Glimcher, Decisions, Uncertainty, and the Brain: The science of neuroeconomics, MIT Press, 2003 and H. G. Schuster, Complex Adaptive Systems: An Introduction, Springer-Verlag, 2001:

-   The animals' behavior is stochastic and unpredictable, when it is viewed at the microscopic level of individual acts.
-   The theory of games provides a plausible basis for explaining the complex and unpredictable patterns of the animals' behavior.

Two key issues are raised here:

-   1. Complexity. The new sciences of complexity may well occupy much of the intellectual activity in the 21st century. In the context of complexity, it is perhaps less ambiguous to speak of complex behavior rather than complex systems. A nonlinear dynamic system may be complex in computational terms but incapable of exhibiting complex behavior. By the same token, a nonlinear system can be simple in computational terms but its underlying dynamics are rich enough to produce complex behavior. The emergent behavior of an evolutionary game may be complex, in the sense that a change in one or more of the parameters in the underlying dynamics of the game can produce a dramatic change in behavior. Note that the dynamics must be nonlinear for complex behavior to be possible.
-   2. Unpredictability. Game theory does not require that animals be fundamentally unpredictable. Rather, it merely requires that the individual behavior of each animal be unpredictable with respect to its opponents.

From this brief discussion on evolutionary games, we may conjecture that the emergent behavior of a multi-user cognitive radio environment is explained by the unpredictable action of each user, as seen individually by the other users (i.e., opponents).

Moreover, given the conflicting influences of cooperation, competition, and exploitation on the emergent behavior of a cognitive radio environment, we may identify two possible end-results:

-   (i) Positive emergent behavior, which is characterized by order, and therefore a harmonious and efficient utilization of the radio spectrum by all users of the cognitive radio. The positive emergent behavior may be likened to Maynard Smith's evolutionary stable strategy.
-   (ii) Negative emergent behavior, which is characterized by disorder, and therefore a culmination of traffic jams, chaos, and unused radio spectrum. The possibility of characterizing negative emergent behavior as a chaotic phenomenon needs some explanation. Idealized chaos theory is based on the premise that dynamic noise in a state-space model describing the phenomenon of interest is zero. However, it is unlikely that this highly restrictive condition is satisfied by real-life physical phenomena. So, the proper thing to say is that it is feasible for a negative emergent behavior to be stochastic chaotic.

From a practical perspective, what we need are, first, a reliable criterion for the early detection of negative emergent behavior (i.e., disorder) and, second, corrective measures for dealing with this undesirable behavior. With regard to the first issue, we recognize that cognition, in a sense, is an exercise in assigning probabilities to possible behavioral responses. In light of this, it may be said that in the case of positive emergent behavior, predictions are possible with nearly complete confidence. On the other hand, in the case of negative emergent behavior, predictions are made with far less confidence. We may thus think of a likelihood function based on predictability as a criterion for the onset of negative emergent behavior. In particular, we envision a maximum-likelihood detector, the design of which is based on the predictability of negative emergent behavior.

Cognitive radio holds the promise of a new frontier in wireless communications. Specifically, with dynamic coordination of a spectrum sharing process, significant “white space” can be created in the spectrum, which in turn makes it possible to improve spectrum utilization under constantly changing user conditions. The dynamic spectrum sharing capability builds on two matters:

-   (i) a paradigm shift in wireless communications from transmitter-centricity to receiver-centricity, whereby interference power rather than transmitter emission is regulated; and
-   (ii) awareness of and adaptation to the environment by the radio.

Cognitive radio is a computer-intensive system, so much so that we may think of it as a radio with a computer inside or a computer that transmits. Such a system provides a novel basis for balancing the communication and computing needs of a user against those of a network with which the user would like to operate. With so much reliance on computing, language understanding may play a key role in the organization of domain knowledge for the cognitive cycle, which may include any or all of the following:

-   (i) a wake cycle, as shown in FIG. 1, during which the cognitive radio supports the tasks of passive radio-scene analysis, active transmit-power control and dynamic spectrum management, and possibly other tasks such as channel-state estimation and predictive modeling;
-   (ii) a sleep cycle, during which incoming stimuli are integrated into the domain knowledge of a “personal digital assistant”; and
-   (iii) a prayer cycle, which caters to items that cannot be dealt with during the sleep cycle and may therefore be resolved through interaction of the cognitive radio with the user in real time.

It is widely recognized that the use of a MIMO antenna architecture can provide for a spectacular increase in the spectral efficiency of wireless communications. With improved spectrum utilization as one of the primary objectives of cognitive radio, it seems logical to explore building the MIMO antenna architecture into the design of cognitive radio. The end result is a cognitive MIMO radio that offers the ultimate in flexibility, which is exemplified by four degrees of freedom: carrier frequency, channel bandwidth, transmit power, and multiplexing gain.

Turbo processing has established itself as one of the key technologies for modern digital communications. In specific terms, turbo processing has made it possible to provide significant improvements in the signal processing operations of channel decoding and channel equalization, both of which are basic to the design of digital communication systems. Compared to traditional design methodologies, these improvements manifest themselves in spectacular reductions in frame error rates for prescribed signal-to-noise ratios. It also seems logical to build turbo processing into the design of cognitive radio in order to support Quality of Service (QoS) requirements, for example.

With computing being so central to the implementation of cognitive radio, it is natural that we keep nanotechnology in mind as we look to the future. Since the first observation of multi-walled carbon nanotubes in transmission electron microscopy studies, carbon nanotubes have been explored extensively in theoretical and experimental studies of nanotechnology. Nanotubes offer the potential for a paradigm shift from the narrow confine of today's information processing based on silicon technology to a much broader field of information processing, given the rich electro-mechano-opto-chemical functionalities that are endowed in nanotubes. This paradigm shift may well impact the evolution of cognitive radio in its own way.

The potential for cognitive radio to make a significant difference to wireless communications is immense, hence the reference to it as a disruptive but unobtrusive technology. In the final analysis, however, one key issue that may shape the evolution of cognitive radio in the course of time, be that for civilian or military applications, is trust, which is two-fold, including trust by the users of cognitive radio, and trust by all other users who might be interfered with.

1. A device for controlling transmit power in a multi-user cognitive radio network, the device comprising: a transceiver and one or more antennas; and a processor coupled to the transceiver and to a memory, the processor being configured to determine a transmit power level for a transmitter in the transceiver according to an iterative water-filling procedure and to monitor transmit behavior of other transmitters using a learning algorithm, transmit behavior of other transmitters being monitored through a receiver of the transceiver and the one or more antennas.
2. The device of claim 1, wherein the one or more antennas comprise a MIMO (Multiple-Input Multiple Output) antenna architecture.
3. The device of claim 1, wherein the device comprises a mobile communication device.
4. The device of claim 1, wherein the device comprises a base station.
5. The device of claim 1, wherein the device comprises equipment in a wireless communication system.
6. The device of claim 1, wherein the processor is further configured to implement a cognitive radio.
 7. The device of claim 1, wherein the iterativewater-filling procedure comprises: initializing a transmit powerdistribution across n transmitters; performing water-filling for thetransmitter of the transceiver to determine a transmit power level for atarget data transmission rate of the transmitter subject to a powerconstraint for the transmitter and a level of interference, the level ofinterference comprising a noise floor plus either initialized transmitpower levels or previously determined transmit power levels for theother transmitters; determining whether a data transmission rate of thetransmitter is greater than or less than a target data transmission rateof the transmitter, and adjusting the determined transmit power levelfor the transmitter based on the determination; determining whether atarget data transmission rate of at least one of the n transmitters isnot satisfied by a respective adjusted transmit power level for the atleast one transmitter, and repeating the operation of performingwater-filling on determining that the target data transmission rate ofat least one of the n transmitters is not satisfied.
 8. The device ofclaim 7, wherein the processor is further configured to determinewhether the target data transmission rate of at least one of the ntransmitters is not satisfied by determining whether the target datatransmission rates of all of the n transmitters are not satisfied, andto repeat the operation of performing water-filling on determining thatthe target data transmission rates of all of the n transmitters are notsatisfied.
 9. The device of claim 7, wherein the processor is furtherconfigured to adjust the determined transmit power level by: determiningwhether increasing the determined transmit power level of thetransmitter would violate an interference level limit on determiningthat the data transmission rate of the transmitter is less than thetarget data transmission rate of the transmitter; and increasing thedetermined transmit power level of the transmitter on determining thatincreasing the determined transmit power level of the transmitter wouldnot violate an interference level limit.
 10. The device of claim 9, wherein the processor is configured to determine whether increasing the determined transmit power level of the transmitter would violate an interference level limit by determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit within a spectrum hole, and wherein the processor is further configured to detect a further spectrum hole and to determine a further transmit power level for the transmitter for transmission within the further spectrum hole, on determining that the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter and that increasing the determined transmit power level of the transmitter would violate an interference level limit within the spectrum hole.
 11. The device of claim 1, wherein the processor is further configured to detect at least one spectrum hole, and to determine a transmit power level by determining a transmit power level for the transmitter for transmission within the at least one spectrum hole.
 12. The device of claim 11, wherein the at least one spectrum hole comprises a plurality of spectrum holes, and wherein the processor is configured to determine a transmit power level by determining a set of transmit power levels comprising multiple transmit power levels for transmission within respective ones of the plurality of spectrum holes.
 13. The device of claim 11, wherein the processor is further configured to predict subsequent availability of the at least one spectrum hole, and to determine a new transmit power level on predicting that the at least one spectrum hole is to become unavailable.
 14. The device of claim 11, wherein the processor is further configured to detect a further spectrum hole, to monitor interference in the at least one spectrum hole, and to determine a new transmit power level for the transmitter for transmission within the further spectrum hole on detecting an increase in interference in the at least one spectrum hole.
 15. The device of claim 1, wherein the processor is further configured to determine a position of the transmitter relative to the other transmitters, and to determine a multi-user path loss matrix of the operating environment of the transmitter based on the determined position of the transmitter relative to the other transmitters.
 16. The device of claim 1, wherein the learning algorithm comprises a regret-conscious learning algorithm or a Lagrangian learning algorithm.
 17. A method for controlling transmit power in a multi-user cognitive radio network, the method comprising: determining a transmit power level for a transmitter of a transceiver according to an iterative water-filling procedure; and monitoring transmit behavior of other transmitters using a learning algorithm, the monitoring comprising monitoring transmit behavior of the other transmitters through a receiver of the transceiver.
 18. The method of claim 17, wherein the iterative water-filling procedure comprises: initializing a transmit power distribution across n transmitters; performing water-filling for the transmitter of the transceiver to determine a transmit power level for a target data transmission rate of the transmitter subject to a power constraint for the transmitter and a level of interference, the level of interference comprising a noise floor plus either initialized transmit power levels or previously determined transmit power levels for the other transmitters; determining whether a data transmission rate of the transmitter is greater than or less than a target data transmission rate of the transmitter, and adjusting the determined transmit power level for the transmitter based on the determination; determining whether a target data transmission rate of at least one of the n transmitters is not satisfied by a respective adjusted transmit power level for the at least one transmitter, and repeating the operation of performing water-filling on determining that the target data transmission rate of at least one of the n transmitters is not satisfied.
 19. The method of claim 18, wherein the performing, determining whether a data transmission rate of a transmitter is greater than or less than a target transmission rate, and adjusting are repeated for each of the other transmitters.
 20. The method of claim 19, wherein determining whether the target data transmission rate of at least one of the n transmitters is not satisfied comprises determining whether the target data transmission rates of all of the n transmitters are not satisfied, and repeating the operation of performing water-filling on determining that the target data transmission rates of all of the n transmitters are not satisfied.
 21. The method of claim 18, wherein adjusting comprises: determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit on determining that the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter; and increasing the determined transmit power level of the transmitter on determining that increasing the determined transmit power level of the transmitter would not violate an interference level limit.
 22. The method of claim 21, wherein determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit comprises determining whether increasing the determined transmit power level of the transmitter would violate an interference level limit within a spectrum hole, the method further comprising: detecting a further spectrum hole; and determining a further transmit power level for the transmitter for transmission within the further spectrum hole, on determining that the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter and that increasing the determined transmit power level of the transmitter would violate an interference level limit within the spectrum hole.
 23. The method of claim 21, further comprising: adapting a modulation strategy for transmission of data by the transmitter on determining that the data transmission rate of the transmitter is less than the target data transmission rate of the transmitter and increasing the determined transmit power level of the transmitter would violate an interference level limit.
 24. The method of claim 17, further comprising: detecting at least one spectrum hole, wherein determining a transmit power level comprises determining a transmit power level for the transmitter for transmission within the at least one spectrum hole.
 25. The method of claim 24, wherein the at least one spectrum hole comprises a plurality of spectrum holes, and wherein determining a transmit power level comprises determining a set of transmit power levels comprising multiple transmit power levels for transmission within respective ones of the plurality of spectrum holes.
 26. The method of claim 24, further comprising: predicting subsequent availability of the at least one spectrum hole; and determining a new transmit power level on predicting that the at least one spectrum hole is to become unavailable.
 27. The method of claim 24, further comprising: detecting a further spectrum hole; monitoring interference in the at least one spectrum hole; and determining a new transmit power level for the transmitter for transmission within the further spectrum hole on detecting an increase in interference in the at least one spectrum hole.
 28. The method of claim 17, further comprising: determining a position of the transmitter relative to the other transmitters; and determining a multi-user path loss matrix of the operating environment of the transmitter based on the determined position of the transmitter relative to the other transmitters.
 29. The method of claim 17, wherein the learning algorithm comprises a regret-conscious learning algorithm or a Lagrangian learning algorithm.
 30. A non-transitory machine-readable medium storing instructions which when executed perform the method of claim 17.
 31-153. (canceled)
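
For illustration only, the iterative water-filling steps recited in claims 7 and 18 can be sketched as follows. This is one possible reading under stated assumptions, not the claimed implementation: the function names, the flat `cross_gain` matrix standing in for the multi-user path loss matrix, the subcarrier model of a spectrum hole, and the bisection used to adjust power toward the target rate are all hypothetical choices. The interference level limit of claims 9 and 21 is folded into the per-transmitter power cap `p_max` for brevity.

```python
import math


def water_fill(interference, budget):
    """Split `budget` over subcarriers as p_k = max(0, mu - interference_k)."""
    lo, hi = min(interference), max(interference) + budget
    for _ in range(100):  # bisection on the water level mu
        mu = (lo + hi) / 2.0
        if sum(max(0.0, mu - v) for v in interference) > budget:
            hi = mu
        else:
            lo = mu
    return [max(0.0, lo - v) for v in interference]


def rate(power, interference):
    """Achievable rate in bits per channel use, summed over subcarriers."""
    return sum(math.log2(1.0 + p / v) for p, v in zip(power, interference))


def seen_interference(i, powers, noise, cross_gain):
    """Noise floor plus the other transmitters' power as received by user i."""
    n = len(powers)
    return [noise[k] + sum(cross_gain[i][j] * powers[j][k]
                           for j in range(n) if j != i)
            for k in range(len(noise))]


def adjust_power(interference, target, p_max):
    """Smallest water-filled allocation meeting `target`, capped at `p_max`."""
    full = water_fill(interference, p_max)
    if rate(full, interference) < target:
        return full  # target unreachable under the power constraint
    lo, hi = 0.0, p_max
    for _ in range(60):  # the rate grows with the budget, so bisect on it
        mid = (lo + hi) / 2.0
        if rate(water_fill(interference, mid), interference) >= target:
            hi = mid
        else:
            lo = mid
    return water_fill(interference, hi)


def iterative_water_filling(noise, cross_gain, targets, p_max,
                            rounds=50, tol=1e-3):
    """Return (powers, rates, converged) after at most `rounds` sweeps."""
    n = len(targets)
    powers = [[0.0] * len(noise) for _ in range(n)]  # initial distribution
    rates = [0.0] * n
    for _ in range(rounds):
        for i in range(n):  # each transmitter water-fills in turn
            interf = seen_interference(i, powers, noise, cross_gain)
            powers[i] = adjust_power(interf, targets[i], p_max[i])
        rates = [rate(powers[i],
                      seen_interference(i, powers, noise, cross_gain))
                 for i in range(n)]
        if all(r >= t - tol for r, t in zip(rates, targets)):
            return powers, rates, True  # every target rate is satisfied
    return powers, rates, False


if __name__ == "__main__":
    # Hypothetical two-user, four-subcarrier scenario with weak coupling.
    result = iterative_water_filling(
        noise=[0.1, 0.2, 0.4, 0.8],
        cross_gain=[[0.0, 0.1], [0.1, 0.0]],
        targets=[4.0, 3.0],
        p_max=[10.0, 10.0],
    )
    print(result)
```

In this sketch a transmitter whose target cannot be met at full power simply stays at its cap and the loop reports non-convergence; the claims instead contemplate moving to a further spectrum hole or adapting the modulation strategy in that situation.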